Essays on macroeconomics with microeconomic heterogeneity
This dissertation consists of three essays on macroeconomics with microeconomic heterogeneity.
In Chapter 1, I empirically investigate the extent to which regional economic activity responds to fiscal shocks. Exploiting state-level variation in military procurement, I apply an instrumental-variable local projection method, extended to the panel-data context, to estimate the dynamic causal effects of a military spending shock. These estimates, which I refer to as "regional impulse responses," yield three main empirical findings. First, regional output displays a large and persistent response lasting over a decade after a regional military spending shock, even though military spending itself returns to its normal level after five years. Second, regional population grows gradually over the decade following the shock. Third, the response of construction to military spending is proportionately much larger than that of total output and accounts for an important share of the overall output response. This evidence suggests that labor reallocation across regions can be very important for the impact of fiscal policy.
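For readers unfamiliar with the estimator, the panel instrumental-variable local projection behind these "regional impulse responses" can be sketched on simulated data. This is a minimal illustration only: the data-generating process, the instrument z, and all magnitudes are placeholders, not the chapter's actual instrument construction.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, H = 51, 60, 10  # states, years, horizons

# Simulated panel (illustration only): z instruments regional military
# spending g, and y is log regional output.
z = rng.normal(size=(N, T))
g = 0.8 * z + rng.normal(size=(N, T))                  # relevant first stage
y = np.cumsum(0.05 * g + 0.1 * rng.normal(size=(N, T)), axis=1)

def lp_iv(y, g, z, h):
    """2SLS local projection at horizon h: regress y_{t+h} - y_{t-1} on g_t,
    instrumented by z_t, after within-state demeaning (state fixed effects)."""
    dy = y[:, 1 + h:] - y[:, :T - 1 - h]               # cumulative response
    gg, zz = g[:, 1:T - h], z[:, 1:T - h]
    dy, gg, zz = (x - x.mean(axis=1, keepdims=True) for x in (dy, gg, zz))
    dy, gg, zz = dy.ravel(), gg.ravel(), zz.ravel()
    return (zz @ dy) / (zz @ gg)                       # just-identified IV

irf = [lp_iv(y, g, z, h) for h in range(H)]            # regional impulse response
```

Plotting `irf` against the horizon h then traces out the dynamic causal effect of a one-unit spending shock on cumulative regional output.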
Chapter 2 quantitatively analyzes the regional and aggregate implications of the empirical findings in the previous chapter. I first study a simple model of regional reallocation to build intuition. I then develop a multi-region New Keynesian model with labor migration and housing construction and calibrate it to a U.S. economy with 51 regions. The model reveals that labor reallocation amplifies regional output through a boom in construction spending, and amplifies aggregate output through a positive "covariance effect" arising from net directed migration toward booming regions, where population and regional output per resident rise simultaneously. To circumvent high dimensionality, I propose a new method to tractably solve spatial dynamic stochastic general equilibrium (DSGE) models. Using this method, I find that in response to a national military buildup that affects regions differentially, in a manner consistent with U.S. expenditure patterns, labor reallocation amplifies the aggregate output effect of government spending by 30 percent relative to a model without it.
Chapter 3 proposes a new method to study the macroeconomic impact of microeconomic shocks. I show under what conditions, and how, a disaggregated stochastic dynamic model can be represented as a recursive aggregate system at each order of approximation. This method provides a sufficient-statistic characterization of "macro state variables or shocks" in terms of heterogeneous micro shocks and structures: the first- and second-order macro shocks are shaped by, respectively, the average and the dispersion of their micro counterparts, weighted by micro impact intensities. I apply the method in several applications to illustrate the importance of micro heterogeneity and nonlinearity in macroeconomics. First, I provide a first-order decomposition of the aggregate consumption function when consumers have different marginal propensities to consume. A new redistribution channel, the asset position adjustment channel, is identified; its magnitude depends on the variability of capital investment and the persistence of shocks. Second, I show that permanent income inequality alters aggregate demand responses mainly at the second order rather than the first. Non-homothetic preferences and the dispersion of permanent income jointly determine the impact on the aggregate consumption response.
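Schematically, with purely illustrative notation (the weights omega_i and kappa_i stand in for micro impact intensities; these are not the dissertation's own symbols), the first- and second-order characterization reads:

```latex
% First-order macro shock: impact-intensity-weighted average of micro shocks
\varepsilon^{(1)}_t \;\approx\; \sum_i \omega_i \, \varepsilon_{i,t},
\qquad
% Second-order correction: intensity-weighted dispersion of micro shocks
\varepsilon^{(2)}_t \;\approx\; \sum_i \kappa_i
  \left( \varepsilon_{i,t} - \bar{\varepsilon}_t \right)^2 .
```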
WHC: Weighted Hybrid Criterion for Filter Pruning on Convolutional Neural Networks
Filter pruning has attracted increasing attention in recent years for its
capacity in compressing and accelerating convolutional neural networks. Various
data-independent criteria, including norm-based and relationship-based ones,
were proposed to prune the most unimportant filters. However, these
state-of-the-art criteria fail to fully consider the dissimilarity of filters,
and thus might lead to performance degradation. In this paper, we first analyze
the limitation of relationship-based criteria with examples, and then introduce
a new data-independent criterion, Weighted Hybrid Criterion (WHC), to tackle
the problems of both norm-based and relationship-based criteria. By taking the
magnitude of each filter and the linear dependence between filters into
consideration, WHC can robustly recognize the most redundant filters, which can
be safely pruned without introducing severe performance degradation to
networks. Extensive pruning experiments in a simple one-shot manner demonstrate
the effectiveness of the proposed WHC. In particular, WHC can prune ResNet-50
on ImageNet, reducing floating-point operations by more than 42% without any
loss in top-5 accuracy.
Comment: Accepted by ICASSP 202
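One plausible instantiation of such a weighted hybrid score (an illustrative sketch; the paper's exact formula may differ) weights each filter's norm by its mean dissimilarity to the other filters in the layer, so near-duplicate filters receive low scores and are pruned first:

```python
import numpy as np

def whc_scores(filters):
    """Score each filter by its L2 norm weighted by its mean dissimilarity
    (1 - |cosine similarity|) to the other filters; low score = redundant."""
    W = filters.reshape(len(filters), -1)          # (n_filters, c_in*k*k)
    norms = np.linalg.norm(W, axis=1)
    unit = W / norms[:, None]
    cos = np.abs(unit @ unit.T)                    # pairwise |cosine similarity|
    np.fill_diagonal(cos, 0.0)
    dissim = 1.0 - cos.sum(axis=1) / (len(W) - 1)  # mean dissimilarity
    return norms * dissim

rng = np.random.default_rng(1)
filters = rng.normal(size=(8, 3, 3, 3))            # 8 filters, 3x3x3 each
filters[7] = filters[0]                            # plant a duplicate filter
scores = whc_scores(filters)
prune_idx = np.argsort(scores)[:2]                 # lowest-scoring candidates
```

Because the duplicated filters are perfectly correlated, their dissimilarity terms shrink and both receive identical, depressed scores.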
Stochastic Bridges as Effective Regularizers for Parameter-Efficient Tuning
Parameter-efficient tuning methods (PETs) have achieved promising results in
tuning large pre-trained language models (PLMs). By formalizing frozen PLMs and
additional tunable parameters as systems and controls respectively, PETs can be
theoretically grounded to optimal control and further viewed as optimizing the
terminal cost and running cost in the optimal control literature. Despite the
elegance of this theoretical grounding, in practice, existing PETs often ignore
the running cost and only optimize the terminal cost, i.e., focus on optimizing
the loss function of the output state, regardless of the running cost that
depends on the intermediate states. Since it is non-trivial to directly model
the intermediate states and design a running cost function, we propose to use
latent stochastic bridges to regularize the intermediate states and use the
regularization as the running cost of PETs. As the first work to propose
regularized PETs that use stochastic bridges as the regularizers (running
costs) for the intermediate states, we show the effectiveness and generality of
this regularization across different tasks, PLMs and PETs. In view of the great
potential and capacity, we believe more sophisticated regularizers can be
designed for PETs and better performance can be achieved in the future. The
code is released at
\url{https://github.com/thunlp/stochastic-bridge-pet/tree/main}.
Comment: ACL 2023 Findings
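As a rough illustration of the idea (not the paper's exact construction, which operates on latent projections of PLM hidden states), a Brownian-bridge running cost pins the first and last states and penalizes intermediate states for straying from the bridge's mean path, weighted by the inverse bridge variance:

```python
import numpy as np

def bridge_running_cost(states, eps=1e-6):
    """Running cost from a Brownian bridge pinned at the first and last
    states: at time t in (0, 1) the bridge mean is (1-t)*x0 + t*xT and its
    variance is t*(1-t), so squared deviations of intermediate states are
    weighted by the inverse bridge variance (a negative log-likelihood up
    to constants)."""
    L = len(states) - 1
    x0, xT = states[0], states[-1]
    cost = 0.0
    for l in range(1, L):
        t = l / L
        mean = (1 - t) * x0 + t * xT
        var = t * (1 - t) + eps
        cost += np.sum((states[l] - mean) ** 2) / var
    return cost / max(L - 1, 1)

# States lying exactly on the line between the endpoints incur zero cost.
states = [np.full(4, t) for t in np.linspace(0.0, 1.0, 6)]
cost_on_path = bridge_running_cost(states)
```

Adding this cost to the task loss regularizes the trajectory of intermediate states while the terminal loss is optimized as usual.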
Communicative Agents for Software Development
Software engineering is a domain characterized by intricate decision-making
processes, often relying on nuanced intuition and consultation. Recent
advancements in deep learning have started to revolutionize software
engineering practices through elaborate designs implemented at various stages
of software development. In this paper, we present an innovative paradigm that
leverages large language models (LLMs) throughout the entire software
development process, streamlining and unifying key processes through natural
language communication, thereby eliminating the need for specialized models at
each phase. At the core of this paradigm lies ChatDev, a virtual chat-powered
software development company that mirrors the established waterfall model,
meticulously dividing the development process into four distinct chronological
stages: designing, coding, testing, and documenting. Each stage engages a team
of agents, such as programmers, code reviewers, and test engineers, fostering
collaborative dialogue and facilitating a seamless workflow. The chat chain
acts as a facilitator, breaking down each stage into atomic subtasks. This
enables dual roles, allowing for proposing and validating solutions through
context-aware communication, leading to efficient resolution of specific
subtasks. The instrumental analysis of ChatDev highlights its remarkable
efficacy in software generation, enabling the completion of the entire software
development process in under seven minutes at a cost of less than one dollar.
It not only identifies and alleviates potential vulnerabilities but also
rectifies potential hallucinations while maintaining commendable efficiency and
cost-effectiveness. The potential of ChatDev unveils fresh possibilities for
integrating LLMs into the realm of software development.
Comment: 25 pages, 9 figures, 2 tables
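The chat-chain mechanics can be sketched as a toy loop in which each stage pairs an instructor with an assistant that exchange proposals until acceptance. This is a minimal sketch: `respond()` is a stub standing in for the LLM calls ChatDev actually makes, and only the phase names follow the abstract.

```python
# Toy chat chain: per phase, an assistant proposes and an instructor
# validates, looping until the instructor accepts the proposal.

def respond(role, task, history):
    """Stub for an LLM call (assumption for illustration)."""
    if role == "assistant":
        return f"proposal for {task} (round {len(history)})"
    return "ACCEPT" if len(history) >= 2 else "revise: add detail"

def run_phase(task, max_rounds=5):
    history = []
    for _ in range(max_rounds):
        proposal = respond("assistant", task, history)
        history.append(("assistant", proposal))
        verdict = respond("instructor", task, history)
        history.append(("instructor", verdict))
        if verdict == "ACCEPT":
            return proposal, history
    return proposal, history

artifacts = {phase: run_phase(phase)[0]
             for phase in ["designing", "coding", "testing", "documenting"]}
```

Each phase's accepted artifact then seeds the context of the next phase, mirroring the waterfall hand-off described above.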
Mind's Mirror: Distilling Self-Evaluation Capability and Comprehensive Thinking from Large Language Models
Large language models (LLMs) have achieved remarkable advancements in the
field of natural language processing. However, the sheer scale and
computational demands of these models present formidable challenges when
considering their practical deployment in resource-constrained contexts. While
techniques such as chain-of-thought (CoT) distillation have displayed promise
in distilling LLMs into small language models (SLMs), there is a risk that
distilled SLMs may still carry over flawed reasoning or hallucinations
inherited from their LLM counterparts. To address these issues, we propose a
twofold methodology: First, we introduce a novel method for distilling the
self-evaluation capability inherent in LLMs into SLMs, which aims to mitigate
the adverse effects of erroneous reasoning and reduce hallucinations. Second,
we advocate for a comprehensive distillation process that incorporates multiple
distinct chain-of-thought and self-evaluation paradigms and ensures a more
holistic and robust knowledge transfer into SLMs. Experiments on three NLP
benchmarks demonstrate that our method significantly improves the performance
of distilled SLMs and sheds light on the path towards developing smaller models
closely aligned with human cognition.
Comment: 13 pages, 5 figures
ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate
Text evaluation has historically posed significant challenges, often
demanding substantial labor and time cost. With the emergence of large language
models (LLMs), researchers have explored LLMs' potential as alternatives for
human evaluation. While these single-agent-based approaches show promise,
experimental results suggest that further advancements are needed to bridge the
gap between their current effectiveness and human-level evaluation quality.
Recognizing that best practices of human evaluation processes often involve
multiple human annotators collaborating in the evaluation, we resort to a
multi-agent debate framework, moving beyond single-agent prompting strategies.
The multi-agent-based approach enables a group of LLMs to synergize with an
array of intelligent counterparts, harnessing their distinct capabilities and
expertise to enhance efficiency and effectiveness in handling intricate tasks.
In this paper, we construct a multi-agent referee team called ChatEval to
autonomously discuss and evaluate the quality of generated responses from
different models on open-ended questions and traditional natural language
generation (NLG) tasks. Our analysis shows that ChatEval transcends mere
textual scoring, offering a human-mimicking evaluation process for reliable
assessments. Our code is available at https://github.com/chanchimin/ChatEval
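The debate scheme can be illustrated with stubbed referees (a minimal sketch, not ChatEval's actual prompting): each referee function stands in for an LLM, revising its score after seeing its peers' current scores, and the final verdict averages the last round.

```python
# Toy multi-agent debate for evaluation with synchronous score updates.

def debate(referees, response, rounds=3):
    scores = [ref(response, []) for ref in referees]        # initial opinions
    for _ in range(rounds):
        scores = [ref(response, [s for j, s in enumerate(scores) if j != i])
                  for i, ref in enumerate(referees)]        # see peers' scores
    return sum(scores) / len(scores)                        # aggregate verdict

def make_referee(initial):
    """Stub referee: nudges its initial opinion toward the peer average."""
    def ref(response, peer_scores):
        if not peer_scores:
            return initial
        peer_avg = sum(peer_scores) / len(peer_scores)
        return 0.5 * initial + 0.5 * peer_avg
    return ref

referees = [make_referee(s) for s in (2.0, 6.0, 10.0)]
verdict = debate(referees, "candidate answer")
```

The debate pulls divergent initial opinions toward a consensus while each referee retains part of its own judgment, which is the intuition behind aggregating multiple annotators.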
Boosting Inference Efficiency: Unleashing the Power of Parameter-Shared Pre-trained Language Models
Parameter-shared pre-trained language models (PLMs) have emerged as a
successful approach in resource-constrained environments, enabling substantial
reductions in model storage and memory costs without significant performance
compromise. However, parameter sharing does not alleviate the computational
burden associated with inference, which impedes its practicality in situations
with stringent latency requirements or limited computational resources.
Building upon neural ordinary
differential equations (ODEs), we introduce a straightforward technique to
enhance the inference efficiency of parameter-shared PLMs. Additionally, we
propose a simple pre-training technique that leads to fully or partially shared
models capable of achieving even greater inference acceleration. The
experimental results demonstrate the effectiveness of our methods on both
autoregressive and autoencoding PLMs, providing novel insights into more
efficient utilization of parameter-shared models in resource-constrained
settings.
Comment: EMNLP 2023 Findings
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors in Agents
Autonomous agents empowered by Large Language Models (LLMs) have undergone
significant improvements, enabling them to generalize across a broad spectrum
of tasks. However, in real-world scenarios, cooperation among individuals is
often required to enhance the efficiency and effectiveness of task
accomplishment. Hence, inspired by human group dynamics, we propose a
multi-agent framework, AgentVerse, that can collaboratively and dynamically
adjust its composition as a greater-than-the-sum-of-its-parts system. Our
experiments demonstrate that AgentVerse can effectively deploy
multi-agent groups that outperform a single agent. Furthermore, we delve into
the emergence of social behaviors among individual agents within a group during
collaborative task accomplishment. In view of these behaviors, we discuss some
possible strategies to leverage positive ones and mitigate negative ones for
improving the collaborative potential of multi-agent groups. Our code for
AgentVerse will soon be released at
\url{https://github.com/OpenBMB/AgentVerse}.
Comment: Work in progress